Reference databases for taxonomic assignment in metagenomics

Authors

  • Monica Santamaria
  • Bruno Fosso
  • Arianna Consiglio
  • Giorgio De Caro
  • Giorgio Grillo
  • Flavio Licciulli
  • Sabino Liuni
  • Marinella Marzano
  • Daniel Alonso-Alemany
  • Gabriel Valiente
  • Graziano Pesole
Abstract

Metagenomics is providing unprecedented access to environmental microbial diversity. The amplicon-based metagenomics approach involves PCR-targeted sequencing of a genetic locus that meets several requirements: it must be ubiquitous in the taxonomic range of interest, variable enough to discriminate between different species but flanked by highly conserved sequences, and of a suitable size to be sequenced on next-generation platforms. The internal transcribed spacers 1 and 2 (ITS1 and ITS2) of the ribosomal DNA operon and one or more hypervariable regions of the 16S ribosomal RNA gene are typically used to identify fungal and bacterial species, respectively. In this context, reliable reference databases and taxonomies are crucial for assigning amplicon sequence reads to the correct phylogenetic ranks. Several resources provide consistent phylogenetic classification of publicly available 16S ribosomal DNA sequences, whereas reference databases for the ribosomal internal transcribed spacers are notably less advanced. In this review, we give an overview of existing reference resources for both types of markers, highlighting the strengths and possible shortcomings of their use for metagenomics purposes. Moreover, we present a new database, ITSoneDB, of well-annotated and phylogenetically classified ITS1 sequences to be used as a reference collection in metagenomic studies of environmental fungal communities. ITSoneDB is available for download and browsing at http://itsonedb.ba.itb.cnr.it/.


Similar resources

Assessment of Common and Emerging Bioinformatics Pipelines for Targeted Metagenomics

Targeted metagenomics, also known as metagenetics, is a high-throughput sequencing application focusing on a nucleotide target in a microbiome to describe its taxonomic content. A wide range of bioinformatics pipelines are available to analyze sequencing outputs, and the choice of an appropriate tool is crucial and not trivial. No standard evaluation method exists for estimating the accuracy of...


Accurate Taxonomic Assignment of Short Pyrosequencing Reads

Ambiguities in the taxonomy-dependent assignment of pyrosequencing reads are usually resolved by mapping each read to the lowest common ancestor in a reference taxonomy of all those sequences that match the read. This conservative approach has the drawback of mapping a read to a possibly large clade that may also contain many sequences not matching the read. A more accurate taxonomic assignment...
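
The lowest-common-ancestor mapping mentioned in this summary can be illustrated with a minimal Python sketch. It is not the method of the cited paper: the toy taxonomy `PARENT`, the helper `lineage`, and the function `lowest_common_ancestor` are hypothetical names introduced here, and the taxa matching a read are assumed to be already known (for example from a similarity search).

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy: each taxon maps to its parent (the root maps to None).
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "Fungi": "root",
    "Ascomycota": "Fungi",
    "Basidiomycota": "Fungi",
    "Saccharomyces": "Ascomycota",
    "Candida": "Ascomycota",
    "Cryptococcus": "Basidiomycota",
}


def lineage(taxon: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the path from the root down to `taxon`."""
    path: List[str] = []
    node: Optional[str] = taxon
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def lowest_common_ancestor(
    taxa: List[str], parent: Dict[str, Optional[str]]
) -> Optional[str]:
    """Return the deepest taxon shared by the lineages of all `taxa`.

    Returns None when `taxa` is empty.
    """
    lineages = [lineage(t, parent) for t in taxa]
    lca: Optional[str] = None
    # Walk the lineages level by level from the root until they diverge.
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca


if __name__ == "__main__":
    # Taxa of the reference sequences matched by one read (assumed input).
    print(lowest_common_ancestor(["Saccharomyces", "Candida"], PARENT))
    print(lowest_common_ancestor(["Saccharomyces", "Cryptococcus"], PARENT))
```

In this sketch, a read whose matches fall in different phyla is assigned only to their shared kingdom. That is the conservative behaviour the summary describes: the assigned clade may contain many sequences that do not match the read.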


SOrt-ITEMS: Sequence orthology based approach for improved taxonomic estimation of metagenomic sequences

MOTIVATION: One of the first steps in metagenomic analysis is the assignment of reads/contigs obtained from various sequencing technologies to their correct taxonomic bins. Similarity-based binning methods assign a read to a taxon/clade, based on the pattern of significant BLAST hits generated against sequence databases. Existing methods, which use bit-score as the sole parameter to ascertain th...


Taxator-tk: precise taxonomic assignment of metagenomes by fast approximation of evolutionary neighborhoods

MOTIVATION: Metagenomics characterizes microbial communities by random shotgun sequencing of DNA isolated directly from an environment of interest. An essential step in computational metagenome analysis is taxonomic sequence assignment, which allows identifying the sequenced community members and reconstructing taxonomic bins with sequence data for the individual taxa. For the massive datasets g...


GSTaxClassifier: a genomic signature based taxonomic classifier for metagenomic data analysis

GSTaxClassifier (Genomic Signature based Taxonomic Classifier) is a program for metagenomics analysis of shotgun DNA sequences. The program includes a simple but effective algorithm, a modification of the Bayesian method, to predict the most probable genomic origins of sequences at different taxonomic ranks, on the basis of genome databases; a function to generate genomic profiles of reference...



Journal:
  • Briefings in bioinformatics

Volume 13, Issue 6

Pages: -

Publication date: 2012